Search for: All records

Creators/Authors contains: "Richerson, Elizabeth"

  1. Building speech emotion recognition (SER) models for low-resource languages is challenging because labeled speech data are scarce. This limitation calls for cross-lingual unsupervised domain adaptation techniques that make effective use of labeled data from resource-rich languages. Inspired by the TransVQA framework, we propose a method that leverages a shared quantized feature space to enable knowledge transfer between labeled and unlabeled data across languages. The approach uses a quantized codebook to capture shared features while reducing the domain gap and aligning class distributions, thereby improving classification accuracy. Additionally, an information loss (InfoLoss) mechanism mitigates the loss of critical information during quantization by minimizing the loss within the simplex of posterior class label distributions. The proposed method outperforms state-of-the-art baseline approaches. Index Terms: Speech Emotion Recognition, Cross-lingual Unsupervised Domain Adaptation, Discrete Features, InfoLoss
    Free, publicly-accessible full text available August 17, 2026
  2. Unsupervised domain adaptation offers significant potential for cross-lingual speech emotion recognition (SER). Most relevant studies have treated this problem as a domain mismatch without considering how the phonetics of emotional speech differ across languages. Our study explores universal discrete speech units obtained by vector quantization of WavLM representations of emotional speech in English, Taiwanese Mandarin, and Russian. We estimate cluster-wise distributions of quantized WavLM frames to quantify phonetic commonalities and differences across languages, vowels, and emotions. Our findings indicate that certain emotion-specific phonemes exhibit cross-linguistic similarities and that the distribution of vowels varies with emotional content. Certain vowels show close distributional proximity across languages, offering anchor points for cross-lingual domain adaptation. We also propose and validate a method to quantify phoneme distribution similarities across languages.
    Free, publicly-accessible full text available August 17, 2026
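
Both abstracts rely on assigning speech frames to entries of a shared codebook and then comparing how those assignments are distributed. The sketch below illustrates that general idea only and is not the authors' implementation. It rests on several assumptions: the codebook is fixed by hand rather than learned, 2-D synthetic points stand in for WavLM frames, and Jensen-Shannon divergence is chosen here as one plausible similarity measure (the abstracts do not name the metric used). All function names are hypothetical.

```python
import math
import random
from collections import Counter


def nearest_code(frame, codebook):
    """Return the index of the codebook vector closest to `frame` (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((f - c) ** 2 for f, c in zip(frame, codebook[k])),
    )


def code_histogram(frames, codebook):
    """Normalized histogram of codebook indices over a set of frames."""
    counts = Counter(nearest_code(f, codebook) for f in frames)
    total = len(frames)
    return [counts.get(k, 0) / total for k in range(len(codebook))]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 for identical distributions, at most 1."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Assumption: a tiny hand-picked codebook; a real system would learn it.
codebook = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]
rng = random.Random(0)


def sample_frames(weights, n=300):
    """Draw synthetic 2-D 'frames' scattered around codebook entries."""
    frames = []
    for _ in range(n):
        k = rng.choices(range(len(codebook)), weights=weights)[0]
        cx, cy = codebook[k]
        frames.append((cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)))
    return frames


# Three hypothetical "languages": A and B use the codes similarly, C does not.
hist_a = code_histogram(sample_frames([0.6, 0.3, 0.1]), codebook)
hist_b = code_histogram(sample_frames([0.55, 0.35, 0.1]), codebook)
hist_c = code_histogram(sample_frames([0.1, 0.2, 0.7]), codebook)

print("JS(A, B):", js_divergence(hist_a, hist_b))
print("JS(A, C):", js_divergence(hist_a, hist_c))
```

In this toy setup a small divergence between two histograms would mark codes that are used similarly across the two sources, which is the kind of "anchor point" the second abstract describes; the actual studies may define and measure proximity differently.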